12 research outputs found

    Real-time performance diagnosis and evaluation of big data systems in cloud datacenters

    PhD Thesis. Modern big data processing systems are becoming very complex in terms of large scale, high concurrency, and multi-tenancy. As a result, many failures and performance degradations occur only at run time and are very difficult to capture. Moreover, some issues may be triggered only when certain components are executed. To analyze the root cause of these types of issues, we have to capture the dependencies of each component in real time. Big data processing systems, such as Hadoop and Spark, usually work in large-scale, highly concurrent, and multi-tenant environments that can easily cause hardware and software malfunctions or failures, thereby leading to performance degradation. Several systems and methods exist to detect performance degradation in big data processing systems, perform root-cause analysis, and even overcome the issues causing such degradation. However, these solutions focus on specific problems such as stragglers and inefficient resource utilization. There is a lack of a generic and extensible framework to support the real-time diagnosis of big data systems. Performance diagnosis and prediction of big data systems are highly complex, as these frameworks are typically deployed in cloud data centers that are large-scale, highly concurrent, and follow a multi-tenant model. Several factors, including hardware heterogeneity, stochastic networks, and application workloads, may impact the performance of big data systems. The current state of the art does not sufficiently address the challenge of determining the complex, usually stochastic and hidden relationships between these factors.
To address performance diagnosis and evaluation of big data systems in cloud environments, this thesis proposes multilateral research on monitoring, performance diagnosis, and prediction in cloud-based large-scale distributed systems, supported by a novel, effective, and efficient deployment pipeline. The key contributions of this dissertation are listed below:
• Designing a real-time big data monitoring system called SmartMonit that efficiently collects runtime system information, including computing resource utilization and job execution information, and then relates the collected information to the Execution Graph, which is modeled as a directed acyclic graph (DAG).
• Developing AutoDiagn, an automated real-time diagnosis framework for big data systems that automatically detects performance degradation and inefficient resource utilization problems, while providing online detection and semi-online root-cause analysis for a big data system.
• Designing a novel root-cause analysis technique/system called BigPerf for big data systems that analyzes and characterizes the performance of big data applications by incorporating Bayesian networks to determine uncertain and complex relationships between performance-related factors.
State of the Republic of Turkey and the Turkish Ministry of National Educatio

    RootPath: Root Cause and Critical Path Analysis to Ensure Sustainable and Resilient Consumer-Centric Big Data Processing under Fault Scenarios

    The exponential growth of consumer-centric big data has led to increased concerns regarding the sustainability and resilience of data processing systems, particularly in the face of fault scenarios. This paper presents an approach that integrates Root Cause Analysis (RCA) and Critical Path Analysis (CPA) to address these challenges and ensure sustainable, resilient consumer-centric big data processing. The proposed methodology identifies the root causes behind system faults probabilistically, using Bayesian networks. Furthermore, an Artificial Neural Network (ANN)-based critical path method is employed to identify the critical path that causes a high makespan in MapReduce workflows, in order to enhance fault tolerance and optimize resource allocation. To evaluate the effectiveness of the proposed methodology, we conduct a series of fault injection experiments that simulate various real-world fault scenarios commonly encountered in operational environments. The experimental results show that both models perform well, with accuracies of 95% and 98%, respectively, enabling the development of more robust and reliable consumer-centric systems.
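
As a rough illustration of what identifying root causes "probabilistically" means, the sketch below applies Bayes' rule to rank candidate causes after a slowdown is observed. It is not the paper's Bayesian network: the cause names, the priors, and the likelihoods are all hypothetical values chosen for the example, and a real network would model dependencies among many more variables.

```python
def posterior(priors, likelihoods):
    """Return P(cause | symptom) for each candidate cause via Bayes' rule.

    priors[c]      = P(cause c is active)           (hypothetical values)
    likelihoods[c] = P(symptom observed | cause c)  (hypothetical values)
    Causes are treated as mutually exclusive for simplicity.
    """
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("symptom has zero probability under every cause")
    return {cause: value / evidence for cause, value in joint.items()}


# Hypothetical numbers, for illustration only.
PRIORS = {"cpu_contention": 0.5, "disk_fault": 0.3, "network_delay": 0.2}
LIKELIHOODS = {"cpu_contention": 0.2, "disk_fault": 0.7, "network_delay": 0.4}

if __name__ == "__main__":
    ranked = sorted(posterior(PRIORS, LIKELIHOODS).items(),
                    key=lambda item: item[1], reverse=True)
    for cause, prob in ranked:
        print(f"{cause}: {prob:.3f}")
```

With these made-up numbers the most probable cause is not the one with the highest prior, which is the point of weighting priors by how well each cause explains the observed symptom.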

    Federated-ANN based Critical Path Analysis and Health Recommendations for MapReduce Workflows in Consumer Electronics Applications

    Although much research has been done to improve the performance of big data systems, predicting the performance degradation of these systems quickly and efficiently remains a significant challenge. The complexity of big data systems makes it difficult to predict performance degradation ahead of time. Long execution time is often discussed in the context of performance degradation of big data systems. This paper proposes MrPath, a federated AI-based critical path analysis approach for holistic performance prediction of MapReduce workflows in consumer electronics applications, which also enables root-cause analysis of various types of faults. We have implemented a federated artificial neural network (FANN) to predict the critical path in a MapReduce workflow. After the critical path components (e.g., mapper1, reducer2) are predicted/detected, root-cause analysis uses user-defined functions (UDFs) to pinpoint the most likely reasons for the observed performance problems. Finally, health node classification is performed using an ANN-based Self-Organising Map (SOM). The results show that the AI-based critical path analysis method can significantly illuminate the reasons behind long execution times in big data systems.
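
For readers unfamiliar with the term, the critical path of a workflow is the dependency chain with the largest total duration, which therefore determines the makespan. The sketch below computes it with the classical longest-path method over a DAG; it is not the federated ANN predictor that MrPath describes. The task names echo the abstract's examples, while the durations and dependencies are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, predecessors):
    """Return (path, makespan) for a DAG of tasks.

    durations[t]    = running time of task t
    predecessors[t] = set of tasks that must finish before t starts
    """
    finish = {}     # earliest finish time of each task
    best_pred = {}  # predecessor that finishes last, i.e. the one t waits for
    for task in TopologicalSorter(predecessors).static_order():
        preds = predecessors.get(task, ())
        prev = max(preds, key=lambda p: finish[p], default=None)
        best_pred[task] = prev
        finish[task] = durations[task] + (finish[prev] if prev is not None else 0)

    # Walk back from the task that finishes last to recover the path.
    node = max(finish, key=finish.get)
    makespan = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    path.reverse()
    return path, makespan


# Hypothetical MapReduce job: every reducer waits for every mapper.
DURATIONS = {"mapper1": 4, "mapper2": 9, "reducer1": 3, "reducer2": 5}
PREDECESSORS = {
    "mapper1": set(),
    "mapper2": set(),
    "reducer1": {"mapper1", "mapper2"},
    "reducer2": {"mapper1", "mapper2"},
}

if __name__ == "__main__":
    print(critical_path(DURATIONS, PREDECESSORS))
```

Computing the path this way requires the task durations to be known after the fact; a learned predictor such as the one the abstract describes aims to estimate the critical components without that full information.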

    HTwitt: a hadoop-based platform for analysis and visualization of streaming Twitter data

    No full text

    MapChain: A Blockchain-based Verifiable Healthcare Service Management in IoT-based Big Data Ecosystem

    Internet of Things (IoT)-based healthcare services, which are becoming more widespread, continuously generate huge amounts of data, often called big data. Due to the magnitude and intricacy of the data, it is difficult to find valuable information that can be used for decision-making and prediction. Big data systems serve as a significant infrastructure service that helps IoT systems fulfil their purpose and supports critical decision-making. At the same time, privacy preservation, data integrity, and identity verification are essential requirements in healthcare big data service management. To meet these requirements, this article offers a scalable computing system that provides a verifiable data access mechanism for IoT-enabled health data analytics in the big data ecosystem. The proposed architecture has two primary sub-architectures, namely a big data analytics tracking system and a derived blockchain-based data storage/access system. This approach leverages big data systems and blockchain architecture to analyze and securely store data from IoT-enabled devices and to allow verified access to the stored data. A zero-knowledge protocol is used to ensure that no information is accessible to unauthenticated users and to avoid data linkability. The results demonstrate the effectiveness of our method in addressing big data analytics and privacy issues in healthcare.

    SmartMonit: Real-time Big Data Monitoring System

    No full text